Thursday, October 27, 2005

Where am I?

How can an application or a library find out its own location on Linux?

Suppose you want a Linux application to work right after being unpacked into some directory. No setting a COOL_APP_HOME environment variable, no creating ~/.coolapprc, nothing. If the application is one self-contained binary, there is no problem, but if it has resources (icons, translations, or arbitrary data files), you need to find them somehow. And since the only thing the user did was unpack the application, you can find resources only by using a path relative to the application's own path. But finding the application's path is not straightforward.

The obvious approach is to use argv[0]. But it is not that simple, because argv[0] can be:

  1. An absolute path. You just strip the file name from it and get the path to the application.
  2. A relative path. You need to join the current directory with that relative path and strip the file name.
  3. An application name without a path. This happens when the application was found via the PATH environment variable. You then have to iterate over all PATH elements, trying to find the application there.
  4. Anything. The value of argv[0] is specified in the exec* call and can be absolutely anything.
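
To make the first three cases concrete, here is a minimal sketch. The helper name dir_from_argv0 is my own invention for illustration, and the code assumes that the current directory and PATH have not changed since the program started. Case 4 cannot be detected at all, so the result is only a best-effort guess.

```cpp
#include <cstdlib>    // getenv
#include <limits.h>   // PATH_MAX
#include <sstream>
#include <string>
#include <unistd.h>   // access, getcwd

// Hypothetical helper: best-effort guess of the application's directory
// from argv[0]. Returns an empty string when nothing can be determined.
std::string dir_from_argv0(const std::string& argv0)
{
    const std::string::size_type slash = argv0.rfind('/');

    // Case 1: absolute path, strip the file name.
    if (!argv0.empty() && argv0[0] == '/')
        return slash == 0 ? "/" : argv0.substr(0, slash);

    // Case 2: relative path, join with the current directory.
    if (slash != std::string::npos) {
        char cwd[PATH_MAX];
        if (getcwd(cwd, sizeof cwd) == NULL)
            return "";
        return std::string(cwd) + "/" + argv0.substr(0, slash);
    }

    // Case 3: bare name, look for an executable entry in each PATH element.
    const char* path = std::getenv("PATH");
    if (path == NULL || argv0.empty())
        return "";
    std::istringstream elements(path);
    std::string dir;
    while (std::getline(elements, dir, ':')) {
        if (dir.empty())
            dir = ".";   // an empty PATH element means the current directory
        // Note: access() also succeeds for directories; a stricter check would use stat().
        if (access((dir + "/" + argv0).c_str(), X_OK) == 0)
            return dir;
    }
    return "";
}
```

Calling dir_from_argv0(argv[0]) at the very start of main gives the directory to which resource paths can be appended, as long as the caller passed an honest argv[0].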

A more reliable method is to use the /proc filesystem. /proc/<pid>/exe is a symbolic link to the application's executable, and /proc/self is the same as /proc/<pid of current application>. So calling readlink on /proc/self/exe gives the desired result.
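
A minimal sketch of the readlink approach follows. The function names executable_path and executable_dir are made up for this example, and the code assumes /proc is mounted. Since readlink neither null-terminates the result nor reports truncation, the buffer is grown until the result clearly fits.

```cpp
#include <string>
#include <unistd.h>   // readlink
#include <vector>

// Hypothetical helper: path of the running executable, or "" on failure.
std::string executable_path()
{
    std::vector<char> buf(256);
    for (;;) {
        const ssize_t n = readlink("/proc/self/exe", &buf[0], buf.size());
        if (n < 0)
            return "";                        // /proc not available, or other error
        if (static_cast<std::size_t>(n) < buf.size())
            return std::string(&buf[0], n);   // readlink does not null-terminate
        buf.resize(buf.size() * 2);           // possibly truncated, retry with more room
    }
}

// Hypothetical helper: directory containing the running executable.
std::string executable_dir()
{
    const std::string path = executable_path();
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return "";
    return slash == 0 ? "/" : path.substr(0, slash);
}
```

A resource would then be opened as executable_dir() plus a relative path such as "/icons/app.png" (a hypothetical file name).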

But that does not work for shared libraries. A shared library can have its own resources and might want to find them relative to the library's path. Using /proc/self/exe gives the path to the application, not to the library. The solution here is the dladdr function, which is a nonstandard extension to the dynamic loader interface that is available on Linux. Here's an example of its use:

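#include <dlfcn.h>   // dladdr, Dl_info
#include <string>

std::string where_am_i()
{
    Dl_info info;
    // dladdr returns nonzero on success and 0 on failure.
    if (dladdr(reinterpret_cast<void*>(&where_am_i), &info) != 0)
    {
        return info.dli_fname;
    }
    ....
}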

The function takes a code address and returns information about the shared library that this address belongs to. So if the where_am_i function is defined in a shared library, the above code returns the path to that library. Unfortunately, this works only for dynamic libraries, not for the application itself. So for a really reliable solution one has to combine the /proc/self/exe trick with the dladdr trick.

The only problem is that both tricks are specific to Linux. Why is such basic functionality not in POSIX?

The last interesting case is when the "resource" your application uses is a shared library linked at startup (not loaded explicitly with dlopen). The path to the library must be in the dynamic linker's search path before the application starts, so the above tricks won't help. Fortunately, it's not necessary to create helper applications or scripts; you just need some extra options when linking:

g++ -o executable -Wl,-R -Wl,'$ORIGIN' executable.o libhelper.so

The -Wl,-R -Wl,'$ORIGIN' options add a new element, "$ORIGIN", to the dynamic library search path stored in the executable, and the dynamic linker replaces $ORIGIN with the directory containing the executable. The single quotes keep the shell from expanding $ORIGIN as a shell variable.

With all those tricks in hand, it's no longer necessary to know beforehand the directory where the application will be installed. But I'd still prefer nice built-in support, like Mac OS X bundles.

Tuesday, October 04, 2005

Black or white?

What's better: black box testing, or white box testing? In general, neither, but sometimes peeking at internals can be a great help.

One of the projects at work right now is a compiler for simplified C (basically, C without structures). One recent addition was support for declaring several variables in one declaration. The existing test was:

int32 v, v2, v3;

int32 main()
{
    v3 = 15;
    v2 = v = 10;
    printf("Result = %d\n", v + v2 + v3);
    return 0;
}
and it worked immediately. But there was a bug, and just as an experiment, I asked the author of the test to find it. His attempts were:
  1. Using a duplicate variable name: int32 v, v2, v3, v;. That produced an error, as expected.
  2. Defining an extra local variable "v2". Still no problems.
  3. Initializing some of the variables at the point of definition, not inside the function.
  4. Trying different variable types.
  5. Moving the code from main to another function.

At this point, he gave up. The real problem occurred with this example:

int32 xi = 100, i = xi + 5;

int32 main()
{
    printf("Result = %d\n", i);
}
and was caused by creating variables in reverse order. The language grammar is written in a way that makes traversal in reverse order more natural, and so variable i was initialized before variable xi. Honestly, I can't blame the tester. It's the kind of bug you can think of only if you know that nonterminals can be left-recursive or right-recursive. Or if you see the code.
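
To show how easily the order flips, here is a toy model. It is not the compiler's code: the function names are hypothetical, and it assumes a list rule of the form "declarators: declarator ',' declarators | declarator". If the action that creates a variable runs only after the rest of the list has been handled, the last declarator is created first.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model (all names hypothetical): 'names' holds the declarators in
// source order, 'created' records the order in which variables get created.

// The action runs after the rest of the list is processed,
// so the last declarator is created first.
void create_after_rest(const std::vector<std::string>& names, std::size_t i,
                       std::vector<std::string>& created)
{
    if (i == names.size())
        return;
    create_after_rest(names, i + 1, created);
    created.push_back(names[i]);
}

// The action runs before descending into the rest of the list,
// so source order is preserved.
void create_before_rest(const std::vector<std::string>& names, std::size_t i,
                        std::vector<std::string>& created)
{
    if (i == names.size())
        return;
    created.push_back(names[i]);
    create_before_rest(names, i + 1, created);
}
```

For the declaration int32 xi = 100, i = xi + 5; the first function creates i and then xi, which matches the bug, while the second creates xi and then i.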